Ohadn/qm31 operations #1938

Open
wants to merge 1 commit into
base: ohadn/qm31_arithmetics-in-math_utils

Conversation

ohad-nir-starkware
Collaborator

@ohad-nir-starkware ohad-nir-starkware commented Feb 9, 2025

QM31Operations

Description

Add packed reduced QM31 add, mul, sub, and div via `OpcodeExtension::QM31Operations`.
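For readers unfamiliar with the field, the four operations can be sketched on unpacked coordinates as below. This is an illustrative sketch, not the code from this PR: it assumes the Stwo definition of QM31 (base prime 2^31 - 1, i² = -1, u² = 2 + i), the names `Qm31`, `Cm31`, `mul_by_r`, and `m31_pow` are hypothetical, and packing the four coordinates into a single felt is omitted.

```rust
/// Mersenne prime 2^31 - 1, the modulus of the base field M31.
const P: u64 = (1 << 31) - 1;

/// CM31 element [re, im], read as re + im*i with i^2 = -1.
type Cm31 = [u64; 2];
/// QM31 element [a, b, c, d], read as (a + b*i) + (c + d*i)*u with u^2 = 2 + i.
/// All coordinates are assumed to be reduced (less than P).
type Qm31 = [u64; 4];

fn add(x: Qm31, y: Qm31) -> Qm31 {
    std::array::from_fn(|k| (x[k] + y[k]) % P)
}

fn sub(x: Qm31, y: Qm31) -> Qm31 {
    // Add P before subtracting so the u64 arithmetic never underflows.
    std::array::from_fn(|k| (x[k] + P - y[k]) % P)
}

/// (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
fn cm31_mul(x: Cm31, y: Cm31) -> Cm31 {
    let re = (x[0] * y[0] + P * P - x[1] * y[1]) % P;
    let im = (x[0] * y[1] + x[1] * y[0]) % P;
    [re, im]
}

/// Multiply a CM31 element by the constant 2 + i (the value of u^2).
fn mul_by_r(z: Cm31) -> Cm31 {
    [(2 * z[0] + P - z[1]) % P, (z[0] + 2 * z[1]) % P]
}

/// (A + Bu)(C + Du) = (AC + (2 + i)BD) + (AD + BC)u.
fn mul(x: Qm31, y: Qm31) -> Qm31 {
    let (a, b) = ([x[0], x[1]], [x[2], x[3]]);
    let (c, d) = ([y[0], y[1]], [y[2], y[3]]);
    let (ac, ad, bc) = (cm31_mul(a, c), cm31_mul(a, d), cm31_mul(b, c));
    let r_bd = mul_by_r(cm31_mul(b, d));
    [
        (ac[0] + r_bd[0]) % P,
        (ac[1] + r_bd[1]) % P,
        (ad[0] + bc[0]) % P,
        (ad[1] + bc[1]) % P,
    ]
}

/// Square-and-multiply exponentiation in M31.
fn m31_pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Inverse in CM31: conj(x) / (re^2 + im^2). Returns None for zero.
fn cm31_inv(x: Cm31) -> Option<Cm31> {
    let norm = (x[0] * x[0] + x[1] * x[1]) % P;
    if norm == 0 {
        return None;
    }
    // Fermat's little theorem: norm^(P - 2) is the inverse of norm mod P.
    let n_inv = m31_pow(norm, P - 2);
    Some([x[0] * n_inv % P, (P - x[1]) % P * n_inv % P])
}

/// x / y, computed as x * (C - Du) / (C^2 - (2 + i)D^2) for y = C + Du.
/// Returns None when y is zero.
fn div(x: Qm31, y: Qm31) -> Option<Qm31> {
    let (c, d) = ([y[0], y[1]], [y[2], y[3]]);
    let (c2, r_d2) = (cm31_mul(c, c), mul_by_r(cm31_mul(d, d)));
    let denom = [(c2[0] + P - r_d2[0]) % P, (c2[1] + P - r_d2[1]) % P];
    let denom_inv = cm31_inv(denom)?;
    let lo = cm31_mul(c, denom_inv);
    let hi = cm31_mul([(P - d[0]) % P, (P - d[1]) % P], denom_inv);
    Some(mul(x, [lo[0], lo[1], hi[0], hi[1]]))
}
```

For example, `mul([0, 0, 1, 0], [0, 0, 1, 0])` yields `[2, 1, 0, 0]`, which is the defining relation u² = 2 + i.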


Checklist

  • Linked to GitHub issue
  • Unit tests added
  • Integration tests added
  • This change requires new documentation.
    • Documentation has been added/updated.
    • CHANGELOG has been updated.



github-actions bot commented Feb 9, 2025

**Hyper Threading Benchmark results**




hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     32.429 s ±  0.198 s    [User: 31.652 s, System: 0.775 s]
  Range (min … max):   32.289 s … 32.569 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     26.967 s ±  0.064 s    [User: 26.195 s, System: 0.771 s]
  Range (min … max):   26.922 s … 27.012 s    2 runs
 
Summary
  hyper_threading_pr threads: 1 ran
    1.20 ± 0.01 times faster than hyper_threading_main threads: 1




hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     18.087 s ±  0.061 s    [User: 31.547 s, System: 0.782 s]
  Range (min … max):   18.044 s … 18.130 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     14.955 s ±  0.063 s    [User: 26.302 s, System: 0.806 s]
  Range (min … max):   14.911 s … 15.000 s    2 runs
 
Summary
  hyper_threading_pr threads: 2 ran
    1.21 ± 0.01 times faster than hyper_threading_main threads: 2




hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):     12.651 s ±  0.305 s    [User: 44.151 s, System: 0.950 s]
  Range (min … max):   12.435 s … 12.867 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):     10.975 s ±  0.555 s    [User: 39.528 s, System: 0.997 s]
  Range (min … max):   10.583 s … 11.367 s    2 runs
 
Summary
  hyper_threading_pr threads: 4 ran
    1.15 ± 0.06 times faster than hyper_threading_main threads: 4




hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):     12.156 s ±  0.302 s    [User: 44.754 s, System: 0.933 s]
  Range (min … max):   11.943 s … 12.370 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):     11.098 s ±  0.005 s    [User: 39.360 s, System: 0.977 s]
  Range (min … max):   11.095 s … 11.102 s    2 runs
 
Summary
  hyper_threading_pr threads: 6 ran
    1.10 ± 0.03 times faster than hyper_threading_main threads: 6




hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):     11.969 s ±  0.055 s    [User: 44.627 s, System: 1.007 s]
  Range (min … max):   11.929 s … 12.008 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):     10.639 s ±  0.001 s    [User: 40.056 s, System: 0.991 s]
  Range (min … max):   10.639 s … 10.640 s    2 runs
 
Summary
  hyper_threading_pr threads: 8 ran
    1.12 ± 0.01 times faster than hyper_threading_main threads: 8




hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):     12.368 s ±  0.060 s    [User: 44.776 s, System: 1.110 s]
  Range (min … max):   12.325 s … 12.410 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):     10.801 s ±  0.172 s    [User: 40.147 s, System: 1.108 s]
  Range (min … max):   10.680 s … 10.923 s    2 runs
 
Summary
  hyper_threading_pr threads: 16 ran
    1.15 ± 0.02 times faster than hyper_threading_main threads: 16


@ohad-nir-starkware ohad-nir-starkware changed the base branch from main to ohadn/opcode_extension February 9, 2025 19:17

github-actions bot commented Feb 9, 2025

Benchmark Results for unmodified programs 🚀

| Command | Unit | Mean | Min | Max | Relative |
|---|---|---|---|---|---|
| base big_factorial | s | 2.223 ± 0.025 | 2.197 | 2.277 | 1.08 ± 0.01 |
| head big_factorial | s | 2.061 ± 0.013 | 2.042 | 2.079 | 1.00 |
| base big_fibonacci | s | 2.189 ± 0.053 | 2.148 | 2.330 | 1.09 ± 0.03 |
| head big_fibonacci | s | 2.003 ± 0.022 | 1.982 | 2.060 | 1.00 |
| base blake2s_integration_benchmark | s | 8.075 ± 0.038 | 7.985 | 8.117 | 1.08 ± 0.01 |
| head blake2s_integration_benchmark | s | 7.494 ± 0.039 | 7.455 | 7.580 | 1.00 |
| base compare_arrays_200000 | s | 2.306 ± 0.065 | 2.261 | 2.487 | 1.07 ± 0.03 |
| head compare_arrays_200000 | s | 2.150 ± 0.019 | 2.130 | 2.193 | 1.00 |
| base dict_integration_benchmark | s | 1.519 ± 0.015 | 1.505 | 1.549 | 1.05 ± 0.01 |
| head dict_integration_benchmark | s | 1.441 ± 0.004 | 1.434 | 1.448 | 1.00 |
| base field_arithmetic_get_square_benchmark | s | 1.287 ± 0.012 | 1.267 | 1.310 | 1.07 ± 0.01 |
| head field_arithmetic_get_square_benchmark | s | 1.203 ± 0.005 | 1.195 | 1.212 | 1.00 |
| base integration_builtins | s | 8.158 ± 0.071 | 8.052 | 8.297 | 1.08 ± 0.01 |
| head integration_builtins | s | 7.524 ± 0.032 | 7.464 | 7.583 | 1.00 |
| base keccak_integration_benchmark | s | 8.411 ± 0.084 | 8.293 | 8.568 | 1.07 ± 0.01 |
| head keccak_integration_benchmark | s | 7.830 ± 0.047 | 7.750 | 7.892 | 1.00 |
| base linear_search | s | 2.292 ± 0.024 | 2.267 | 2.351 | 1.07 ± 0.02 |
| head linear_search | s | 2.149 ± 0.029 | 2.121 | 2.217 | 1.00 |
| base math_cmp_and_pow_integration_benchmark | s | 1.595 ± 0.008 | 1.584 | 1.611 | 1.06 ± 0.01 |
| head math_cmp_and_pow_integration_benchmark | s | 1.502 ± 0.008 | 1.494 | 1.517 | 1.00 |
| base math_integration_benchmark | s | 1.542 ± 0.018 | 1.525 | 1.591 | 1.06 ± 0.01 |
| head math_integration_benchmark | s | 1.451 ± 0.005 | 1.443 | 1.458 | 1.00 |
| base memory_integration_benchmark | s | 1.276 ± 0.009 | 1.262 | 1.292 | 1.06 ± 0.02 |
| head memory_integration_benchmark | s | 1.202 ± 0.016 | 1.186 | 1.226 | 1.00 |
| base operations_with_data_structures_benchmarks | s | 1.654 ± 0.024 | 1.640 | 1.722 | 1.06 ± 0.02 |
| head operations_with_data_structures_benchmarks | s | 1.554 ± 0.009 | 1.545 | 1.579 | 1.00 |
| base pedersen | ms | 550.7 ± 2.5 | 546.4 | 556.3 | 1.03 ± 0.01 |
| head pedersen | ms | 533.6 ± 7.1 | 528.4 | 553.2 | 1.00 |
| base poseidon_integration_benchmark | ms | 670.3 ± 6.9 | 659.8 | 680.9 | 1.05 ± 0.02 |
| head poseidon_integration_benchmark | ms | 640.6 ± 8.3 | 633.0 | 661.1 | 1.00 |
| base secp_integration_benchmark | s | 1.910 ± 0.027 | 1.889 | 1.982 | 1.04 ± 0.02 |
| head secp_integration_benchmark | s | 1.838 ± 0.017 | 1.823 | 1.875 | 1.00 |
| base set_integration_benchmark | ms | 648.9 ± 3.0 | 644.7 | 652.8 | 1.00 |
| head set_integration_benchmark | ms | 653.7 ± 4.5 | 648.9 | 663.7 | 1.01 ± 0.01 |
| base uint256_integration_benchmark | s | 4.504 ± 0.024 | 4.441 | 4.522 | 1.08 ± 0.01 |
| head uint256_integration_benchmark | s | 4.183 ± 0.047 | 4.144 | 4.274 | 1.00 |


codecov bot commented Feb 9, 2025

Codecov Report

Attention: Patch coverage is 89.65517% with 15 lines in your changes missing coverage. Please review.

Project coverage is 96.39%. Comparing base (9cdce6f) to head (286ffff).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vm/src/types/relocatable.rs | 86.27% | 7 Missing ⚠️ |
| vm/src/vm/vm_core.rs | 90.27% | 7 Missing ⚠️ |
| vm/src/vm/decoding/decoder.rs | 95.45% | 1 Missing ⚠️ |

Additional details and impacted files
@@                           Coverage Diff                            @@
##           ohadn/qm31_arithmetics-in-math_utils    #1938      +/-   ##
========================================================================
- Coverage                                 96.41%   96.39%   -0.03%     
========================================================================
  Files                                       102      102              
  Lines                                     41871    41993     +122     
========================================================================
+ Hits                                      40371    40479     +108     
- Misses                                     1500     1514      +14     


@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_operations branch 2 times, most recently from efdb5e3 to 6dd3870 Compare February 10, 2025 17:30
@ohad-nir-starkware ohad-nir-starkware changed the base branch from ohadn/opcode_extension to ohadn/u128_encoded_instr February 10, 2025 17:38
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_operations branch 3 times, most recently from b4da427 to eb339d8 Compare February 10, 2025 18:50
@ohad-nir-starkware ohad-nir-starkware changed the base branch from ohadn/u128_encoded_instr to ohadn/qm31_arithmetics-in-math_utils February 10, 2025 18:51
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from d6d9a96 to d1f230b Compare February 10, 2025 21:47
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from d1f230b to 4bca3db Compare February 11, 2025 11:50
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 4bca3db to ea9f93f Compare February 12, 2025 08:07
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_operations branch 3 times, most recently from c9356e0 to c13233c Compare February 12, 2025 13:42
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from ea9f93f to ef1aa46 Compare February 12, 2025 17:42
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from ef1aa46 to cbd7519 Compare February 12, 2025 18:09
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from cbd7519 to ee011aa Compare February 12, 2025 20:12
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_operations branch 3 times, most recently from 0de9b66 to 222bf81 Compare February 13, 2025 16:43
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 86e011f to 668db19 Compare February 14, 2025 19:26
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 4950171 to 42dabda Compare February 15, 2025 11:01
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 42dabda to 597c04f Compare February 15, 2025 13:01

@DavidLevitGurevich DavidLevitGurevich left a comment


Reviewed 5 of 15 files at r2, 7 of 11 files at r3, all commit messages.
Reviewable status: 12 of 16 files reviewed, 2 unresolved discussions (waiting on @fmoletta, @gabrielbosio, @igaray, @juanbono, @Oppen, @pefontana, and @yuvalsw)


vm/src/vm/decoding/decoder.rs line 121 at r3 (raw file):

                || pc_update != PcUpdate::Regular
                || opcode != Opcode::AssertEq
            {

ap_update


vm/src/vm/vm_core.rs line 504 at r3 (raw file):

            .mark_as_accessed(operands_addresses.op1_addr);

        if instruction.opcode_extension == OpcodeExtension::Blake {

why did this make sense but with the QM31 we need a totally different approach? @JulianGCalderon

@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 597c04f to 84ec321 Compare February 18, 2025 08:11
@ohad-nir-starkware ohad-nir-starkware force-pushed the ohadn/qm31_arithmetics-in-math_utils branch from 84ec321 to 9cdce6f Compare February 18, 2025 11:38

@DavidLevitGurevich DavidLevitGurevich left a comment


Reviewed 1 of 6 files at r4.
Reviewable status: 9 of 16 files reviewed, 2 unresolved discussions (waiting on @fmoletta, @gabrielbosio, @igaray, @juanbono, @ohad-nir-starkware, @Oppen, @pefontana, and @yuvalsw)

Collaborator Author

@ohad-nir-starkware ohad-nir-starkware left a comment


Reviewable status: 8 of 17 files reviewed, 2 unresolved discussions (waiting on @DavidLevitGurevich, @fmoletta, @gabrielbosio, @igaray, @juanbono, @JulianGCalderon, @Oppen, @pefontana, and @yuvalsw)


vm/src/vm/vm_core.rs line 504 at r3 (raw file):

Previously, DavidLevitGurevich wrote…

why did this make sense but with the QM31 we need a totally different approach? @JulianGCalderon

I refactored deduce_op0, deduce_op1, and compute_res, so the flow is now more uniform for Stone and QM31Operation.
However, this required importing OpcodeExtension into relocatable.rs, which I'm unsure about.

Also, in deduce_op0 and deduce_op1, the Stone and QM31Operation flows for Res::Mul still diverge inside the function itself: it already extracts Felts from the MaybeRelocatable values, which makes the computation unsuitable for relocatable.rs.
I could place the computation in math_utils, but that would introduce OpcodeExtension to math_utils and also separate the computation of div from that of add, sub, and mul.
Another option is to create a separate utils file specifically for those 4 functions, or to place them in vm_core.rs.

Let me know what you think.


vm/src/vm/decoding/decoder.rs line 121 at r3 (raw file):

Previously, DavidLevitGurevich wrote…

ap_update

Done.
